Improvements of Voice Timbre Control Based on Perceived Age in Singing Voice Conversion
Abstract
As one of the techniques enabling individual singers to produce a variety of voice timbres beyond their own physical constraints, a statistical voice timbre control technique based on perceived age has been developed. In this technique, the perceived age of a singing voice, which is the age of the singer as perceived by the listener, is used as an intuitively understandable measure for describing the voice characteristics of the singing voice. The use of statistical voice conversion (SVC) with a singer-dependent multiple-regression Gaussian mixture model (MR-GMM), which effectively models the voice timbre variations caused by a change in perceived age, makes it possible for individual singers to manipulate the perceived ages of their own singing voices while retaining their own singer identities. However, several issues remain: 1) the controllable range of the perceived age is limited; 2) the quality of the converted singing voice is significantly degraded compared to that of a natural singing voice; and 3) each singer needs to sing the same phrase set as that sung by a reference singer to develop the singer-dependent MR-GMM. To address these issues, we propose the following three methods: 1) a method using gender-dependent modeling to expand the controllable range of the perceived age; 2) a method using direct waveform modification based on the spectrum differential to improve the quality of the converted singing voice; and 3) a rapid unsupervised adaptation method based on maximum a posteriori (MAP) estimation to easily develop the singer-dependent MR-GMM. The experimental results show that the proposed methods achieve a wider controllable range of the perceived age, a significant quality improvement of the converted singing voice, and the development of the singer-dependent MR-GMM using only a few arbitrary phrases as adaptation data.
Key words: statistical singing voice conversion, perceived age, gender-dependent modeling, direct waveform modification, unsupervised adaptation
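The abstract does not give the model equations, so the following is only a minimal sketch of the general idea behind a multiple-regression model: the mean vector of each mixture component is assumed to be a linear function of a perceived-age score, so that changing the score shifts the converted timbre. The function name `target_mean`, the parameter values, and the assumption that the score is centered at zero are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def target_mean(bias: Sequence[float], slope: Sequence[float], age_score: float) -> List[float]:
    """Return the mean vector of one mixture component for a requested score.

    Assumed form (not confirmed by the abstract): mean = bias + slope * age_score.
    """
    if len(bias) != len(slope):
        raise ValueError("bias and slope must have the same dimension")
    return [b + s * age_score for b, s in zip(bias, slope)]


# Hypothetical 3-dimensional spectral parameters for a single component.
bias = [1.0, -0.5, 0.25]
slope = [0.5, 0.25, -0.5]

# A negative score is assumed to mean "younger", a positive score "older".
younger = target_mean(bias, slope, -2.0)
older = target_mean(bias, slope, 2.0)
```

Under this assumed form, a score of zero leaves the component mean at its bias vector, and moving the score in either direction shifts every dimension in proportion to its regression slope.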
Related articles
Voice Timbre Control Based on Perceived Age in Singing Voice Conversion
The perceived age of a singing voice is the age of the singer as perceived by the listener, and is one of the notable characteristics that determines perceptions of a song. In this paper, we describe an investigation of acoustic features that have an effect on the perceived age, and a novel voice timbre control technique based on the perceived age for singing voice conversion (SVC). Singers can...
Statistical approach to perceived age control of singing voice
The perceived age of a singing voice is the age of the singer as perceived by the listener, and is one of the notable characteristics that determines perceptions of a song. Singers can sing expressively by controlling prosody and voice timbre, but the varieties of voice timbre that singers can produce are limited by physical constraints. Previous work has attempted to overcome the limitation th...
An investigation of acoustic features for singing voice conversion based on perceptual age
In this paper, we investigate the acoustic features that can be modified to control the perceptual age of a singing voice. Singers can sing expressively by controlling prosody and vocal timbre, but the varieties of voices that singers can produce are limited by physical constraints. Previous work has attempted to overcome this limitation through the use of statistical voice conversion. This tec...
Evaluation of a singing voice conversion method based on many-to-many eigenvoice conversion
In this paper, we evaluate our proposed singing voice conversion method from various perspectives. To enable singers to freely control the timbre of their singing voice, we have proposed a singing voice conversion method based on many-to-many eigenvoice conversion (EVC) that makes it possible to convert the voice timbre of an arbitrary source singer into that of another arbitrary target singer using a p...
Applying voice conversion to concatenative singing-voice synthesis
This work addresses the application of voice conversion to the singing voice. The GMM-based approach was applied to VOCALOID, a concatenative singing synthesizer, to perform singer timbre conversion. The conversion framework was applied to full-quality singing databases, achieving a satisfactory conversion effect on the synthesized utterances. We report in this paper the results of our experimentatio...
Journal title:
- IEICE Transactions
Volume 99-D
Publication date: 2016